Parallel database sorting

Authors

  • David Taniar
  • J. Wenny Rahayu
Abstract

Sorting in database processing is frequently required through the use of the ORDER BY and DISTINCT clauses in SQL. Sorting is also widely known in the computer science community at large. Sorting in general covers internal and external sorting. Past published work has focused extensively on external sorting on uni-processors (serial external sorting) and internal sorting on multi-processors (parallel internal sorting). External sorting on multi-processors (parallel external sorting) has received surprisingly little attention; furthermore, the way current parallel database systems sort is far from optimal in many scenarios. In this paper, we present a taxonomy for parallel sorting in parallel database systems, which covers five sorting methods: parallel merge-all sort, parallel binary-merge sort, parallel redistribution binary-merge sort, parallel redistribution merge-all sort, and parallel partitioned sort. The first two methods are previously proposed approaches to parallel external sorting that have been adopted as the status quo of parallel database sorting, whereas the latter three methods, which are based on redistribution and repartitioning, are new and have not been discussed in the literature on parallel external sorting. The performance of these five methods is investigated and the results are reported. © 2002 Elsevier Science Inc. All rights reserved.
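
The abstract names the five methods without describing their steps. As a rough illustration only, the sketch below contrasts two of them in Python on in-memory lists, under assumptions that are not taken from the paper: "merge-all" is read as a local sort on each processor followed by a single merge of all runs at one host, and "partitioned sort" is read as a range-based redistribution followed by independent local sorts. The function names, the `boundaries` parameter, and the use of plain lists in place of disk-resident partitions are all hypothetical.

```python
import heapq
from bisect import bisect_right


def merge_all_sort(partitions):
    """Assumed reading of 'parallel merge-all sort' (not the paper's code).

    Each simulated processor sorts its own partition, then one host
    performs a single k-way merge over all sorted runs.
    """
    local_runs = [sorted(p) for p in partitions]  # local sort phase
    return list(heapq.merge(*local_runs))         # final merge at one host


def partitioned_sort(partitions, boundaries):
    """Assumed reading of 'parallel partitioned sort' (not the paper's code).

    Keys are first redistributed by range, so bucket i only holds keys
    smaller than every key in bucket i + 1. Each bucket is then sorted
    independently and the results are concatenated without a merge.
    `boundaries` must be sorted in ascending order.
    """
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for partition in partitions:
        for key in partition:
            # bisect_right picks the range bucket this key belongs to
            buckets[bisect_right(boundaries, key)].append(key)
    sorted_buckets = [sorted(b) for b in buckets]  # independent local sorts
    return [key for b in sorted_buckets for key in b]


if __name__ == "__main__":
    data = [[8, 3, 15], [1, 12, 7], [10, 2, 5]]
    print(merge_all_sort(data))
    print(partitioned_sort(data, boundaries=[5, 10]))
```

In this reading, the merge-all variant concentrates the final merge on one host, while the partitioned variant avoids a final merge at the cost of moving records between processors first; whether that trade-off pays off depends on data skew and the choice of range boundaries.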

Similar resources

Sorting in Parallel Database Systems

Sorting in database processing is frequently required through the use of the ORDER BY and DISTINCT clauses in SQL. Sorting is also widely known in the computer science community at large. Sorting in general covers internal and external sorting. Past published work has focused extensively on external sorting on uni-processors (serial external sorting) and internal sorting on multiprocessors (parallel i...

Tuning a Parallel Database Algorithm on a Shared-memory Multiprocessor

Database query processing can benefit significantly from parallelism. Parallel database algorithms combine substantial CPU and I/O activity, memory requirements, and massive data exchange between processes, all of which must be considered to obtain optimal performance. Since parallel external sorting is a very typical example, we have focused on sorting to tune Volcano, a new query processing s...

External Sorting for Databases in Distributed Heterogeneous Systems

A common approach to external parallel sorting in parallel database query processing is to split the data of the initial runs into partitions. These partitions are assigned statically to the processes of the merge phase to produce a globally sorted result. This strategy may lead to low performance if some processes are overloaded because of data skew or load imbalances. In this paper we describe a n...

A Fast, Storage-Efficient Parallel Sorting Algorithm

A parallel sorting algorithm is presented for storage-efficient internal sorting on MIMD machines. The algorithm first sorts the elements within each node using a serial sorting algorithm, then uses a two-phase parallel merge. The algorithm is comparison-based and requires additional storage on the order of the square root of the number of elements in each node. Performance of the algorithm on two general-...

Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting

We consider the problem of external sorting in a shared-nothing multiprocessor. A critical step in the algorithms we consider is to determine the range of sort keys to be handled by each processor. We consider two techniques for determining these ranges of sort keys: exact splitting, using a parallel version of the algorithm proposed by Iyer, Ricard, and Varman; and probabilistic splitting, whi...

Journal title:
  • Inf. Sci.

Volume 146  Issue 

Pages  -

Publication date 2002